Specialising Paragraph Vectors for Text Polarity Detection
Abstract
This paper presents experiments on specialising Paragraph Vectors, a recent technique for creating embedding vectors for text fragments (phrases, sentences, paragraphs, whole texts, ...), for text polarity detection. The first extension injects polarity information extracted from a polarity lexicon into the embeddings, and the second inserts word order information into Paragraph Vectors. When a logistic-regression classifier was trained on the combined embeddings, these two extensions produced a relevant gain in performance over the standard Paragraph Vector methods proposed by Le and Mikolov (2014).
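The abstract does not say how the lexicon polarity is injected, so the following is only a minimal sketch under stated assumptions: lexicon-derived features are concatenated to each paragraph embedding, and a small logistic-regression classifier is trained on the combined vectors. The toy lexicon, the feature choice, the helper names (`lexicon_features`, `combine`, `train_logreg`, `predict`) and the random stand-in embeddings are all hypothetical and are not taken from the paper.

```python
import math
import random

# Hypothetical toy polarity lexicon (word -> score in [-1, 1]); not from the paper.
LEXICON = {"good": 1.0, "great": 1.0, "love": 0.8,
           "bad": -1.0, "awful": -1.0, "hate": -0.8}

def lexicon_features(tokens):
    """Return [mean polarity of lexicon hits, share of positive words, share of negative words]."""
    scores = [LEXICON[t] for t in tokens if t in LEXICON]
    n = max(len(tokens), 1)
    mean = sum(scores) / len(scores) if scores else 0.0
    pos = sum(1 for s in scores if s > 0) / n
    neg = sum(1 for s in scores if s < 0) / n
    return [mean, pos, neg]

def combine(paragraph_vector, tokens):
    """Assumed injection scheme: concatenate the embedding with lexicon features."""
    return list(paragraph_vector) + lexicon_features(tokens)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=200):
    """Batch gradient descent for binary logistic regression."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        gw = [0.0] * len(w)
        gb = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                gw[j] += err * xj
            gb += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, gw)]
        b -= lr * gb / len(X)
    return w, b

def predict(w, b, x):
    """Return 1 (positive) or 0 (negative) for one combined vector."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0

# Stand-in paragraph vectors: small random noise where a real system
# would use embeddings trained with a Paragraph Vector model.
random.seed(0)
docs = [("i love this great film".split(), 1),
        ("good and great".split(), 1),
        ("awful bad plot".split(), 0),
        ("i hate this bad film".split(), 0)]
X = [combine([random.uniform(-0.1, 0.1) for _ in range(4)], toks) for toks, _ in docs]
y = [label for _, label in docs]
w, b = train_logreg(X, y)
```

Concatenation is only one plausible way to combine the two sources of information; it keeps the embedding untouched and lets the classifier weight the lexicon features on its own.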
Similar resources
Bayesian Paragraph Vectors
Word2vec (Mikolov et al., 2013b) has proven to be successful in natural language processing by capturing the semantic relationships between different words. Built on top of single-word embeddings, paragraph vectors (Le and Mikolov, 2014) find fixed-length representations for pieces of text with arbitrary lengths, such as documents, paragraphs, and sentences. In this work, we propose a novel int...
Improving Paragraph2Vec
Paragraph vectors were proposed as a powerful unsupervised method of learning representations of arbitrary lengths of text. Although paragraph vectors had the advantage of being versatile, being unsupervised and unconstrained by lengths of text, the concept has not been further developed since its first publication. We propose two extensions upon the initial formulation of the paragraph vector,...
From Paragraphs to Vectors and Back Again
I investigate some methods of encoding text into vectors and decoding these vector representations. The purpose of decoding vector representations is twofold. Firstly, I could apply unsupervised learning algorithms to the paragraph vectors to find significant "new" vectors and decode them into paragraphs of text. Effectively, I could process text and generate "new" ideas. Secondly, I could dec...
Document Embeddings for Arabic Sentiment Analysis
Research and industry are increasingly focused on automatically finding the polarity of an opinion regarding a specific subject or entity. Paragraph vectors have recently been proposed to learn embeddings, which are leveraged for English sentiment analysis. This paper focuses on Arabic sentiment analysis and investigates the use of paragraph vectors within machine learning techniques to determi...
Microblog Sentiment Analysis Based on Paragraph Vectors
Microblog sentiment analysis aims at discovering users' attitudes toward hot events. The difficulties of microblog sentiment analysis lie in the short length of the texts and the lack of labeled corpora. Para2vec, which is based on deep learning, has recently attracted attention, and the low-dimensional paragraph vectors trained by para2vec achieve excellent results on sentiment analysis. But when applying it for sentim...